A Maximum-Likelihood Method to Correct for Allelic Dropout in Microsatellite Data with No Replicate Genotypes
Abstract
Allelic dropout is a commonly observed source of missing data in microsatellite genotypes, in which one or both allelic copies at a locus fail to be amplified by the polymerase chain reaction. Especially for samples with poor DNA quality, this problem causes a downward bias in estimates of observed heterozygosity and an upward bias in estimates of inbreeding, owing to mistaken classifications of heterozygotes as homozygotes when one of the two copies drops out. One general approach for avoiding allelic dropout involves repeated genotyping of homozygous loci to minimize the effects of experimental error. Existing computational alternatives often require replicate genotyping as well. These approaches, however, are costly and are suitable only when enough DNA is available for repeated genotyping. In this study, we propose a maximum-likelihood approach together with an expectation-maximization algorithm to jointly estimate allelic dropout rates and allele frequencies when only one set of nonreplicated genotypes is available. Our method considers estimates of allelic dropout caused by both sample-specific factors and locus-specific factors, and it allows for deviation from Hardy-Weinberg equilibrium owing to inbreeding. Using the estimated parameters, we correct the bias in the estimation of observed heterozygosity through the use of multiple imputations of alleles in cases where dropout might have occurred. With simulated data, we show that our method can (1) effectively reproduce patterns of missing data and heterozygosity observed in real data; (2) correctly estimate model parameters, including sample-specific dropout rates, locus-specific dropout rates, and the inbreeding coefficient; and (3) successfully correct the downward bias in estimating the observed heterozygosity. We find that our method is fairly robust to violations of model assumptions caused by population structure and by genotyping errors from sources other than allelic dropout. 
Because the data sets imputed under our model can be investigated in additional subsequent analyses, our method will be useful for preparing data for applications in diverse contexts in population genetics and molecular ecology.
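The downward bias described above can be illustrated with a small simulation. This is a minimal sketch, not the maximum-likelihood or EM procedure of the study: it assumes Hardy-Weinberg proportions, a single dropout rate applied independently to each allele copy, and hypothetical allele frequencies. The function names are illustrative only. Under these assumptions a true heterozygote is observed as a heterozygote with probability (1 - g)^2 and a genotype is non-missing with probability 1 - g^2, so the expected observed heterozygosity among called genotypes is H(1 - g)/(1 + g), where H is the true heterozygosity and g is the dropout rate.

```python
import random


def expected_observed_het(true_het, dropout):
    """Expected heterozygosity among non-missing genotypes.

    Assumes each allele copy drops out independently with probability
    `dropout` (a simplification; the study models sample-specific and
    locus-specific rates as well as inbreeding).
    """
    return true_het * (1 - dropout) / (1 + dropout)


def simulate_observed_het(freqs, dropout, n_individuals, seed=0):
    """Simulate one locus and return the observed heterozygosity."""
    rng = random.Random(seed)
    alleles = list(range(len(freqs)))
    het = 0
    called = 0
    for _ in range(n_individuals):
        # Draw the two allele copies under Hardy-Weinberg proportions.
        a, b = rng.choices(alleles, weights=freqs, k=2)
        # Keep each copy only if it does not drop out.
        seen = [x for x in (a, b) if rng.random() >= dropout]
        if not seen:
            continue  # both copies dropped out: missing genotype
        called += 1
        # A heterozygote is scored as such only if both copies amplify.
        if len(seen) == 2 and a != b:
            het += 1
    return het / called


if __name__ == "__main__":
    freqs = [0.5, 0.3, 0.2]  # hypothetical allele frequencies
    true_het = 1 - sum(p * p for p in freqs)
    for g in (0.0, 0.1, 0.3):
        print(g, expected_observed_het(true_het, g),
              simulate_observed_het(freqs, g, 100_000))
```

Comparing the analytic value with the simulated one at several dropout rates shows how quickly heterozygotes are lost to apparent homozygosity as the rate rises.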
Similar resources
Maximum-likelihood estimation of allelic dropout and false allele error rates from microsatellite genotypes in the absence of reference data.
The importance of quantifying and accounting for stochastic genotyping errors when analyzing microsatellite data is increasingly being recognized. This awareness is motivating the development of data analysis methods that not only take errors into consideration but also recognize the difference between two distinct classes of error, allelic dropout and false alleles. Currently methods to estima...
Assessing allelic dropout and genotype reliability using maximum likelihood.
A growing number of population genetic studies utilize nuclear DNA microsatellite data from museum specimens and noninvasive sources. Genotyping errors are elevated in these low quantity DNA sources, potentially compromising the power and accuracy of the data. The most conservative method for addressing this problem is effective, but requires extensive replication of individual genotypes. In se...
Allelic diversity and association analysis for grain quality traits in exotic rice genotypes
The present research aims to study the association and allelic diversity of linked microsatellite markers to grain quality QTLs of 84 exotic rice genotypes. To this end, 9 microsatellite markers (RM540, RM539, RM587, RM527, RM216, RM467, RM3188, RM246, RM5461) were used in which a total of 61 alleles were identified with a mean of 6 alleles per locus. The polymorphism information content (PIC) ...
Software for Quantifying and Simulating Microsatellite Genotyping Error
Microsatellite genetic marker data are exploited in a variety of fields, including forensics, gene mapping, kinship inference and population genetics. In all of these fields, inference can be thwarted by failure to quantify and account for data errors, and kinship inference in particular can benefit from separating errors into two distinct classes: allelic dropout and false alleles. Pedant is M...
Genetic Diversity of Iranian and Some of European Grapes Revealed by Microsatellite Markers
In order to characterize Iranian grape (Vitis vinifera L.) germplasm, 136 genotypes were collected from five grape growing regions (Azarbaijan, Qazvin, Kordestan, Khorasan and Fars) and genotyped along with 36 European cultivars using 9 sequence tagged microsatellite sites (STMS) markers. The used set of markers could distinguish all 172 genotypes under study. Altogether 84 polymorphic alleles ...